Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Allow more types of dictionary keys in overrides grammar #1208

Merged
merged 16 commits into from
Dec 20, 2020

Conversation

odelalleau
Copy link
Collaborator

@odelalleau odelalleau commented Dec 14, 2020

Motivation

Requested in omry/omegaconf@d959a13#r45076606

Remarks & questions

  1. I wasn't entirely sure where to put this test: https://github.com/odelalleau/hydra/blob/ccde9814d66042f4c6b9fe7ec3db77e253297265/tests/test_hydra.py#L1190 . It's important to have it in addition to the grammar tests in test_overrides_parser because it also tests the config merge (in particular, it matters for quoted strings). Also ideally this would have been a parameterized test, but since integration_test() is slow, I thought it would be more convenient to have a single test to avoid slowing down the test suite even more.

  2. Compared to what I did in OmegaConf, I'm not supporting interpolations as dictionary keys in Hydra's overrides grammar. This is because dictionaries in Hydra are meant to be merged into DictConfigs, and allowing interpolations would raise several questions that I believe are better left for later (should we resolve interpolations before or after merging overrides? do we allow keys to change names dynamically?).

  3. As in my OmegaConf PR, I renamed dictValue and listValue to dictContainer and listContainer, to avoid any confusion that may arise due to the new dictKey parser rule. However, this change impacts more things in Hydra than it did in OmegaConf. Let me know if that's still ok (edit: I should add that there remain some DictValues and ListValues in tests that I didn't change, as I wasn't entirely sure if the "value" there was directly related to the name of the rules)

  4. When implementing equality for quoted strings, it made sense to me to ignore the type of quotes in __eq__. It also didn't break any existing test. Just wanted to flag it here in case it might be a concern.

  5. In this PR the grammar allows non-string dictionary keys like int / float / bool, but in practice trying to use them will not work since OmegaConf does not support them currently. Not sure if I should keep them to be "future proof", or just remove them for now.

  6. I assumed this would make it into 1.0.x and thus I updated the 1.0 doc as well. If that's not the case let me know and I'll revert this change.

  7. I didn't add a news item yet as it didn't really seem worth one on its own (the ability to use non-string keys would be more meaningful to advertise IMO)

Test Plan

pytest (see new tests in the diff)

Related Issues and PRs

Related to omry/omegaconf#445

@facebook-github-bot facebook-github-bot added the CLA Signed This label is managed by the Facebook bot. Authors need to sign the CLA before a PR can be reviewed. label Dec 14, 2020
hydra/core/override_parser/types.py Outdated Show resolved Hide resolved
hydra/core/override_parser/types.py Show resolved Hide resolved
hydra/core/override_parser/overrides_visitor.py Outdated Show resolved Hide resolved
@@ -1047,6 +1047,29 @@ def test_run_pass_list(self, cmd_base: List[str], tmpdir: Any) -> None:
ret, _err = get_run_output(cmd)
assert OmegaConf.create(ret) == OmegaConf.create(expected)

def test_multirun_dict_keys(self, cmd_base: List[str], tmpdir: Any) -> None:
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This test, and test_overrides_dict_keys - are very high level.
you have some low level tests in test_overrides_parser and you are jumping straight into testing as processes.

If I understand the purpose of those tests correctly (and it's hard because they are so high level), I think should should be replaced by lower level config composition test in test_config_loader.py.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So, to clarify the motivation: those two new tests are meant to test the changes in types.py (which are not tested by the low level grammar tests)

The second one (test_overrides_dict_keys) should probably indeed be a lower level test. I wasn't familiar with the various Hydra tests and couldn't figure out where the basic override functionalities were being tested. I just moved it to test_config_loader.py as you suggested: see 6bcaac2, let me know if that makes more sense this way. The main thing that needs to be tested is quoted strings as keys, but I thought it wouldn't hurt to add tests for all other kinds of string formatting that we may expect, just in case (also later on, we should also test other types of keys like int / float / bool if we support them).

I'm not sure I can move test_multirun_dict_keys to this same file though. This test specifically tests the change to _get_value_element_as_str() (l. 429 of types.py in the current diff of this PR), which is used in sweeps to provide the proper command line overrides. Again, the main thing that needs testing are quoted strings, but I added the other key formats as well to be safe.
Maybe to clarify the purpose of this test, I can show you how it fails if I revert the l.429 change:

LexerNoViableAltException: +foo={QuotedString(text='null', quote=<Quote.single: 0>):0}

(this is because the sweeper would not replace the QuotedString with its actual quoted string representation in the command line override)

Side note: I also pushed 0b53209 which removes some comments that seemed irrelevant to me (probably a copy/paste leftover from another comment you can see below them)

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Great.
There is a test in test_overrides_parser that is dedicated for that logic: test_override_get_value_element_method

I think testing there should be sufficient.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think testing there should be sufficient.

Perfect! Done in 8084317

At the same time I uncovered a bug with escaped characters (that affected regular unquoted strings as well, not just dictionary keys), which is now fixed in that same commit.

@omry
Copy link
Collaborator

omry commented Dec 16, 2020

  1. I wasn't entirely sure where to put this test: https://github.com/odelalleau/hydra/blob/ccde9814d66042f4c6b9fe7ec3db77e253297265/tests/test_hydra.py#L1190 . It's important to have it in addition to the grammar tests in test_overrides_parser because it also tests the config merge (in particular, it matters for quoted strings). Also ideally this would have been a parameterized test, but since integration_test() is slow, I thought it would be more convenient to have a single test to avoid slowing down the test suite even more.

I commented about the high level tests.

  1. Compared to what I did in OmegaConf, I'm not supporting interpolations as dictionary keys in Hydra's overrides grammar. This is because dictionaries in Hydra are meant to be merged into DictConfigs, and allowing interpolations would raise several questions that I believe are better left for later (should we resolve interpolations before or after merging overrides? do we allow keys to change names dynamically?).

I suspect OmegaConf dict keys should also not support interpolation for the same reasons.

  1. As in my OmegaConf PR, I renamed dictValue and listValue to dictContainer and listContainer, to avoid any confusion that may arise due to the new dictKey parser rule. However, this change impacts more things in Hydra than it did in OmegaConf. Let me know if that's still ok (edit: I should add that there remain some DictValues and ListValues in tests that I didn't change, as I wasn't entirely sure if the "value" there was directly related to the name of the rules)

Sounds good. I saw that you actually did rename dictValue to dictContainer in the tests. anything else you left as is?

  1. When implementing equality for quoted strings, it made sense to me to ignore the type of quotes in __eq__. It also didn't break any existing test. Just wanted to flag it here in case it might be a concern.

Commented inline.

  1. In this PR the grammar allows non-string dictionary keys like int / float / bool, but in practice trying to use them will not work since OmegaConf does not support them currently. Not sure if I should keep them to be "future proof", or just remove them for now.

Does they work when sweeping over dictionaries?

--multirun foo={a:10},{10:a}
  1. I assumed this would make it into 1.0.x and thus I updated the 1.0 doc as well. If that's not the case let me know and I'll revert this change.

Commented line, revert. this is for 1.1.

  1. I didn't add a news item yet as it didn't really seem worth one on its own (the ability to use non-string keys would be more meaningful to advertise IMO)

I think we need a news fragment, but first we need to pin point exactly what this is enabling.

@odelalleau
Copy link
Collaborator Author

Thanks for the review & comments!

  1. Compared to what I did in OmegaConf, I'm not supporting interpolations as dictionary keys in Hydra's overrides grammar. This is because dictionaries in Hydra are meant to be merged into DictConfigs, and allowing interpolations would raise several questions that I believe are better left for later (should we resolve interpolations before or after merging overrides? do we allow keys to change names dynamically?).

I suspect OmegaConf dict keys should also not support interpolation for the same reasons.

Currently dictionaries in OmegaConf's grammar are used for a different purpose though: they are only inputs to resolvers. So these questions don't actually arise.
I can drop support for interpolations as keys in OmegaConf though if you prefer (it's always possible to replace a dictionary with a list of pairs if someone really needs them).

Sounds good. I saw that you actually did rename dictValue to dictContainer in the tests. anything else you left as is?

Yeah, so if you search for DictValues and ListValues (with an "s"), you'll notice some classes in tests with these names. I wasn't 100% sure I should rename them (mostly because of the "s", which threw me off, and I didn't try to fully understand what this was testing). Happy to rename them as well if you tell me I should!

  1. In this PR the grammar allows non-string dictionary keys like int / float / bool, but in practice trying to use them will not work since OmegaConf does not support them currently. Not sure if I should keep them to be "future proof", or just remove them for now.

Does they work when sweeping over dictionaries?

--multirun foo={a:10},{10:a}

Why would they?

> python my_app.py +foo="{a:10},{10:a}" -m                                                                                 1 ↵  2.48 L

Error merging override +foo={10:a}
Incompatible key type 'int'
        full_key: foo.10
        reference_type=Any
        object_type=dict

Note that I used +, because anyway I can't create a config with an existing foo that would have 10 (the integer) as key.

I think we need a news fragment, but first we need to pin point exactly what this is enabling.

If we merge it in current master, it allows command line overrides with dictionaries whose keys are strings in "non ID" format (see test_dict_key_formats() for examples). Currently if you have a key in your config whose name is not an ID, you can't use it as a dictionary key in your command line overrides.

In the future, it should allow also command line overrides with dictionaries whose keys are not strings (ex: int, float, bool). But that can only happen after OmegaConf supports those.

hydra/core/override_parser/types.py Outdated Show resolved Hide resolved
hydra/core/override_parser/types.py Show resolved Hide resolved
@@ -1047,6 +1047,29 @@ def test_run_pass_list(self, cmd_base: List[str], tmpdir: Any) -> None:
ret, _err = get_run_output(cmd)
assert OmegaConf.create(ret) == OmegaConf.create(expected)

def test_multirun_dict_keys(self, cmd_base: List[str], tmpdir: Any) -> None:
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Great.
There is a test in test_overrides_parser that is dedicated for that logic: test_override_get_value_element_method

I think testing there should be sufficient.

Also made the `QuotedString` dataclass immutable for hash safety.
Also improved the corresponding tests for dictionary keys.
@odelalleau
Copy link
Collaborator Author

Side note: I removed a print statement that seemed out of place in fc87464

hydra/_internal/grammar/utils.py Outdated Show resolved Hide resolved
hydra/core/override_parser/types.py Outdated Show resolved Hide resolved
hydra/core/override_parser/types.py Outdated Show resolved Hide resolved
hydra/core/override_parser/types.py Show resolved Hide resolved
tests/test_config_loader.py Outdated Show resolved Hide resolved
tests/test_config_loader.py Outdated Show resolved Hide resolved
Copy link
Collaborator

@omry omry left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good, a couple of small comments.

hydra/core/override_parser/overrides_visitor.py Outdated Show resolved Hide resolved
hydra/core/override_parser/types.py Outdated Show resolved Hide resolved
@omry
Copy link
Collaborator

omry commented Dec 19, 2020

Looking good @odelalleau, I am ready to merge it.

@omry
Copy link
Collaborator

omry commented Dec 19, 2020

@omry omry merged commit 7224465 into facebookresearch:master Dec 20, 2020
@fuhuifang
Copy link

Hi, how to parse something like key={/-\\+.$%*@: 1} from command line? I'm getting "error: argument --info/-i: ignored explicit argument 'nputFormat'".

I tried a simpler dictKey like key={a:1} and it works.

@odelalleau
Copy link
Collaborator Author

Hi, how to parse something like key={/-\\+.$%*@: 1} from command line? I'm getting "error: argument --info/-i: ignored explicit argument 'nputFormat'".

Using single quotes 'key={/-\\+.$%*@: 1}' seems to be working for me. That being said, note that the support for quoted strings in keys (e.g., 'key={"a^b": 1}') that was introduced in this PR has eventually been reverted in Hydra 1.1, so you are limited by the set of (possibly escaped) characters supported in unquoted strings.

@fuhuifang
Copy link

Hi, how to parse something like key={/-\\+.$%*@: 1} from command line? I'm getting "error: argument --info/-i: ignored explicit argument 'nputFormat'".

Using single quotes 'key={/-\\+.$%*@: 1}' seems to be working for me. That being said, note that the support for quoted strings in keys (e.g., 'key={"a^b": 1}') that was introduced in this PR has eventually been reverted in Hydra 1.1, so you are limited by the set of (possibly escaped) characters supported in unquoted strings.

Thanks so much for the quick response. Single quotes doesn't work for me. I just found a working solution: "key='{/-\\+.$%*@: 1}'".

@odelalleau
Copy link
Collaborator Author

Single quotes doesn't work for me. I just found a working solution: "key='{/-\\+.$%*@: 1}'"

Hmm, I think either:

  • Your shell doesn't handle single quotes the same as mine (I'm using zsh), or
  • You are not using (a release candidate of) Hydra 1.1

The solution you found gives a different result: it sets key to the string "{/-\\+.$%*@: 1}" instead of it being a dictionary.

To check your shell behavior, you can do:

> python -c "import sys; print(sys.argv[1])" 'key={/-\\+.$%*@: 1}'
key={/-\\+.$%*@: 1}

@fuhuifang
Copy link

fuhuifang commented May 18, 2021

Single quotes doesn't work for me. I just found a working solution: "key='{/-\\+.$%*@: 1}'"

Hmm, I think either:

  • Your shell doesn't handle single quotes the same as mine (I'm using zsh), or
  • You are not using (a release candidate of) Hydra 1.1

The solution you found gives a different result: it sets key to the string "{/-\\+.$%*@: 1}" instead of it being a dictionary.

To check your shell behavior, you can do:

> python -c "import sys; print(sys.argv[1])" 'key={/-\\+.$%*@: 1}'
key={/-\\+.$%*@: 1}

This works for me, but Hydra 1.1 is not compatible with our code. Do you know it will be released?

@odelalleau
Copy link
Collaborator Author

This works for me, but Hydra 1.1 is not compatible with our code. Do you know it will be released?

There are indeed a bunch of changes from 1.0 to 1.1 you'll need to take care of (you can start here: https://hydra.cc/docs/next/upgrades/intro)

AFAIK the official release of 1.1 is still a few weeks from now. You can give the RC1 a shot (see release notes at https://github.com/facebookresearch/hydra/releases/tag/v1.1.0.rc1) but there might still be a few rough edges, so don't use it in a critical application just yet.

If you want to stick to 1.0 I'm afraid there aren't many options -- you might be able to hack something around with a custom resolver but the 1.0 resolvers are a bit limited so it probably wouldn't be great. A simpler temporary hack (until you upgrade to 1.1) could be to replace whatever character that is not allowed in 1.0 with _ and manually replace it at the very beginning of your program... please don't keep such a hack for too long though ;)

@omry
Copy link
Collaborator

omry commented May 18, 2021

At this point I recommend that you migrate to Hydra 1.1 by installing the first release candidate and fixing all the warnings.
In general it should be pretty smooth. If you run into a wall feel free to file an issue with a question (ideally with a minimal repro).

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
CLA Signed This label is managed by the Facebook bot. Authors need to sign the CLA before a PR can be reviewed.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants